Active Discriminative Text Representation Learning

نویسندگان

  • Ye Zhang
  • Matthew Lease
  • Byron C. Wallace
چکیده

We propose a new active learning (AL) method for text classification based on convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as features, tuning these to the task at hand. We argue that AL strategies for neural text classification should focus on selecting instances that most affect the embedding space (i.e., induce discriminative word representations). This is in contrast to traditional AL approaches (e.g., uncertainty sampling), which specify higher level objectives. We propose a simple approach that selects instances containing words whose embeddings are likely to be updated with the greatest magnitude, thereby rapidly learning discriminative, task-specific embeddings. Empirical results show that our method outperforms baseline AL approaches.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Visual-textual Attention Driven Fine-grained Representation Learning

Fine-grained image classification is to recognize hundreds of subcategories belonging to the same basic-level category, which is a highly challenging task due to the quite subtle visual distinctions among similar subcategories. Most existing methods generally learn part detectors to discover discriminative regions for better classification accuracy. However, not all localized parts are benefici...

متن کامل

Multimodal Learning with Deep Boltzmann Machines

A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a probability density over the space of multimodal...

متن کامل

Broadcast News Story Boundary Detection Using Visual, Audio and Text Features

News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...

متن کامل

Unsupervised learning by completing partially occluded CAD renderings

Unsupervised learning has received a lot of attention in the recent past and has been applied to multiple domains images, videos, ego-centric motions, etc. In this project, we propose to use a large corpus of 3D CAD models to train an unsupervised representation. We follow the Deep Convolutional Generative Adversarial Network (DCGAN) framework to train a generative and a discriminative model. G...

متن کامل

Discriminative Cross-View Binary Representation Learning

Learning compact representation is vital and challenging for large scale multimedia data. Cross-view/crossmodal hashing for effective binary representation learning has received significant attention with exponentially growing availability of multimedia content. Most existing crossview hashing algorithms emphasize the similarities in individual views, which are then connected via cross-view sim...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017